Prefix Probabilities from Stochastic Tree Adjoining Grammars
Abstract
Language models for speech recognition typically use a probability model of the form $\Pr(a_n \mid a_1, a_2, \ldots, a_{n-1})$. Stochastic grammars, on the other hand, are typically used to assign structure to utterances. A language model of the above form is constructed from such grammars by computing the prefix probability $\sum_{w \in \Sigma^*} \Pr(a_1 \cdots a_n w)$, where $w$ ranges over all possible terminations of the prefix $a_1 \cdots a_n$. The main result in this paper is an algorithm to compute such prefix probabilities given a stochastic Tree Adjoining Grammar (TAG). The algorithm achieves the required computation in $O(n^6)$ time. The probabilities of subderivations that do not derive any words in the prefix, but contribute structurally to its derivation, are precomputed to achieve termination. This algorithm enables existing corpus-based estimation techniques for stochastic TAGs to be used for language modelling.
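To make the link between prefix probabilities and the conditional language-model probability concrete, here is a minimal sketch. It is not the paper's TAG algorithm: it assumes a hypothetical toy distribution over a handful of complete sentences (`SENTENCES`) in place of a stochastic grammar, and computes prefix probabilities by brute-force summation. The function names are illustrative only. The conditional probability of the next word is then the ratio of two prefix probabilities.

```python
from fractions import Fraction

# Hypothetical toy distribution over complete sentences.
# It stands in for a stochastic grammar; the probabilities sum to 1.
SENTENCES = {
    ("the", "dog", "barks"): Fraction(1, 2),
    ("the", "dog", "sleeps"): Fraction(1, 4),
    ("the", "cat", "sleeps"): Fraction(1, 4),
}


def prefix_probability(prefix):
    """Sum Pr(prefix + w) over all completions w, by brute force."""
    prefix = tuple(prefix)
    n = len(prefix)
    return sum(
        (p for sentence, p in SENTENCES.items() if sentence[:n] == prefix),
        Fraction(0),
    )


def next_word_probability(history, word):
    """Pr(word | history) as a ratio of two prefix probabilities."""
    denominator = prefix_probability(history)
    if denominator == 0:
        raise ValueError("history has zero prefix probability")
    return prefix_probability(tuple(history) + (word,)) / denominator


if __name__ == "__main__":
    print(prefix_probability(("the", "dog")))
    print(next_word_probability(("the", "dog"), "barks"))
```

Brute-force enumeration only works here because the toy language is finite. A grammar generally derives infinitely many completions of a prefix, which is why a dedicated algorithm such as the one described in the abstract is needed.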
Similar resources
Prefix Probabilities for Linear Indexed Grammars
We show how prefix probabilities can be computed for stochastic linear indexed grammars (SLIGs). Our results apply as well to stochastic tree-adjoining grammars (STAGs), due to their equivalence to SLIGs.
Prefix probabilities for linear indexed grammars
We show how prefix probabilities can be computed for stochastic linear indexed grammars (SLIGs). Our results apply as well to stochastic tree-adjoining grammars (STAGs), due to their equivalence to SLIGs.
Prefix Probabilities from Stochastic Tree Adjoining Grammars
Language models for speech recognition typically use a probability model of the form $\Pr(a_n \mid a_1, a_2, \ldots, a_{n-1})$. Stochastic grammars, on the other hand, are typically used to assign structure to utterances. A language model of the above form is constructed from such grammars by computing the prefix probability $\sum_{w \in \Sigma^*} \Pr(a_1 \cdots a_n w)$, where $w$ represents all possible terminations of the prefix a...
Stochastic Categorial Grammars
Statistical methods have turned out to be quite successful in natural language processing. In recent years, several models of stochastic grammars have been proposed, including models based on lexicalised context-free grammars [3], tree adjoining grammars [15], or dependency grammars [2, 5]. In this exploratory paper, we propose a new model of stochastic grammar, whose originality derive...
Prefix Probabilities for Linear Context-Free Rewriting Systems
We present a novel method for the computation of prefix probabilities for linear context-free rewriting systems. Our approach streamlines previous procedures to compute prefix probabilities for context-free grammars, synchronous context-free grammars and tree adjoining grammars. In addition, the methodology is general enough to be used for a wider range of problems involving, for example, sever...